Abstract. Recent advances in experimental neuroscience allow, for the first 
time, non-invasive studies of the white matter tracts in the human central 
nervous system, thus making available cutting-edge brain anatomical data
describing these global connectivity patterns. This new, non-invasive technique
uses magnetic resonance imaging to construct a snapshot of the cortical
network within the living human brain. Here, we report on the initial success
of a new weighted network communicability measure in distinguishing local 
and global differences between diseased patients and controls. This approach 
builds on recent advances in network science, where an underlying connectivity 
structure is used as a means to measure the ease with which information can 
flow between nodes. One advantage of our method is that it deals directly with 
the real-valued connectivity data, thereby avoiding the need to discretise the 
corresponding adjacency matrix, that is, to round weights up to 1 or down to 
0, depending upon some threshold value. Experimental results indicate that 
the new approach is able to highlight biologically relevant features that are not 
immediately apparent from the raw connectivity data. 



1. Motivation 

In recent years complex networks have received a significant amount of attention
(Albert & Barabási 2002; Newman 2003; Strogatz 2001). The need to
study apparently disparate real-world networks using a single unified language 
has led to the growth of an interdisciplinary field that involves mathematicians, 
physicists, computer scientists, engineers and researchers from both the natural 
and social sciences. In this work we are interested in nature's most complex
system, the human cerebral cortex (Sporns & Zwi 2004). Recent breakthroughs in
diffusion magnetic resonance imaging (MRI) have enabled neuroscientists to con- 
struct connectivity matrices for the human brain and 'proof of principle' work has 
shown that existing biological knowledge can be recovered from this connectivity
data (Klein et al. 2007).

Our ability to understand and compare different connectivity structures can 
be greatly facilitated by the introduction of easily computable measures that 
characterise the network topology. Typically, measures of this type rely heavily
on the idea that communication, to be understood here as the ease of information
spread between nodes on the network, takes place along geodesics. However, in
many real-world networks information can disseminate along non-shortest paths
(Borgatti 2005; Newman 2005), and for such networks any meaningful measure of
'communicability' should account not only for the shortest path between two
nodes, but also for all other possible routes. Motivated by this consideration,
Estrada & Hatano (2008) recently advanced a new definition of communicability
that takes non-shortest paths into account with an appropriate length-based
weighting. This definition applies to networks with unweighted edges. In the 
case where the connectivity information is real-valued, converting this informa- 
tion into the required binary format is undesirable because (a) it requires a cutoff 
value to be determined and (b) fine details about connectivity strengths are lost. 

This report has two main aims: (i) development of a new, computable measure 
of connectivity for a weighted network, and (ii) application of this new measure 
to the case of cutting-edge anatomical connectivity data for the brain. In §2 we
extend the definition of communicability to the case of weighted networks, taking
care to deal with the issue of normalisation. We then present a comparison of
connectivity data for stroke patients and healthy control subjects in §3.



2. Network Communicability 

Suppose we are given a network consisting of (a) a list of nodes and (b) a list 
telling us which pairs of nodes are connected. In the language of graph theory, 
this is an undirected, unweighted graph that could be defined in terms of the 
adjacency matrix $A \in \mathbb{R}^{N \times N}$, which has $a_{ij} = a_{ji} = 1$ if nodes $i$ and $j$ are
connected and $a_{ij} = a_{ji} = 0$ otherwise. We will always set $a_{ii} = 0$, so that
self-links are disallowed. Estrada & Hatano (2008) recently put forward the concept
of communicability to address the issue that the existence or nonexistence of
an edge does not necessarily capture the degree of "connectedness" between a 
pair of nodes. For example two nodes that are not themselves connected, but 
have many neighbours in common should be regarded as closer together than two 
unconnected nodes that can only be joined through a long chain of edges. An 
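extremely useful observation is that if we raise the adjacency matrix to the $k$th
power, then its $(i,j)$th element

    $(A^k)_{ij} = \sum_{r_1=1}^{N} \sum_{r_2=1}^{N} \cdots \sum_{r_{k-1}=1}^{N} a_{i,r_1} a_{r_1,r_2} \cdots a_{r_{k-1},j}$    (1)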

counts the number of walks of length k that start at node i and finish at node 
j. Here the term walk refers to any possible traversal through the network that 
follows edges, and length refers to the number of edges involved. Estrada & 
Hatano argued that a level of communicability between two nodes could be
assigned by summing the number of walks of length 1, 2, 3, .... Short walks are
more important than long walks; for example, in a message-passing scenario
shorter walks are faster and cheaper. To arrive at a single real number, walks of
length $k$ are therefore penalised by the factor $1/k!$. This leads to a definition of
communicability between nodes $i$ and $j$, for $i \neq j$, given by
$\left(\sum_{k=1}^{\infty} A^k/k!\right)_{ij}$ or, more compactly, $\exp(A)_{ij}$ (Estrada & Hatano 2008);
the two expressions agree for $i \neq j$ because they differ only by the identity matrix.
We also note that in addition to giving a neat characterisation in terms of the
matrix exponential, the choice of scaling factor $k!$ can also be justified from the
perspective of statistical mechanics (Estrada & Hatano 2007).
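
As a small illustration of these two facts (not taken from the original study), the following Python sketch uses NumPy to count walks with matrix powers and to evaluate $\exp(A)$ for a four-node path graph. The function name `communicability` and the example graph are assumptions made purely for illustration; the exponential is computed from the eigendecomposition, which is valid here because $A$ is symmetric.

```python
import numpy as np


def communicability(A):
    """Return exp(A) for a symmetric adjacency matrix A.

    Uses A = V diag(lam) V^T, so that exp(A) = V diag(exp(lam)) V^T.
    """
    A = np.asarray(A, dtype=float)
    lam, V = np.linalg.eigh(A)
    return (V * np.exp(lam)) @ V.T


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
A = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# (A^k)[i, j] counts the walks of length k from node i to node j.
# The only walk of length 3 from node 0 to node 3 is 0-1-2-3.
walks_3 = np.linalg.matrix_power(A, 3)
print(walks_3[0, 3])

# Communicability from node 0 decays with distance along the path.
G = communicability(A)
print(G[0, 1], G[0, 2], G[0, 3])
```

For matrices that are not symmetric, a general-purpose routine such as `scipy.linalg.expm` would be the more usual choice.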

In our context, the connectivity information arises in the form of real-valued,
non-negative weights, where a larger weight $a_{ij}$ indicates that nodes $i$ and $j$ are
more strongly connected. The identity (1) remains valid in this more general
setting, but now the term $a_{i,r_1} a_{r_1,r_2} \cdots a_{r_{k-1},j}$ does not give a zero/one
contribution depending on whether the walk $i \mapsto r_1 \mapsto r_2 \mapsto \cdots \mapsto r_{k-1} \mapsto j$
is possible. Instead it contributes the product of the weights along all the edges
in the walk.

Although it is appealing to use $\exp(A)$ in this way to define communicability
for a weighted network, such a measure is likely to suffer from difficulties if the
weights are poorly calibrated. A highly promiscuous node with large weights
is liable to have an undue influence; similar effects have been observed in the
context of spectral clustering (Higham, Kalna & Kibble 2007). A natural
normalisation that can be justified from first principles is to divide the weight $a_{ij}$
by the product $\sqrt{d_i d_j}$, where $d_i := \sum_{k=1}^{N} a_{ik}$ is the generalised degree of node $i$.
This leads us to define the communicability between distinct nodes $i$ and $j$ in a
weighted network by

    $\left(\exp\left(D^{-1/2} A D^{-1/2}\right)\right)_{ij}$,    (2)

where the diagonal degree matrix $D \in \mathbb{R}^{N \times N}$ has the form $D := \mathrm{diag}(d_i)$.

In the next section we show that this new measure extracts useful information 
from brain connectivity networks. 
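
The definition in (2) can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used for the experiments: the function name `weighted_communicability` and the toy weights are assumptions, and every node is assumed to have a strictly positive generalised degree.

```python
import numpy as np


def weighted_communicability(A):
    """Return exp(D^{-1/2} A D^{-1/2}) for a symmetric, non-negative weight matrix A."""
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)  # generalised degrees d_i
    if np.any(d <= 0):
        raise ValueError("every node needs a positive generalised degree")
    scale = 1.0 / np.sqrt(d)
    M = A * np.outer(scale, scale)  # entries a_ij / sqrt(d_i d_j)
    lam, V = np.linalg.eigh(M)  # M is symmetric, so eigh applies
    return (V * np.exp(lam)) @ V.T


# Hypothetical 3-node network: nodes 0 and 2 are linked only through node 1.
A = np.array([
    [0.0, 0.8, 0.0],
    [0.8, 0.0, 0.3],
    [0.0, 0.3, 0.0],
])
C = weighted_communicability(A)
C_rescaled = weighted_communicability(10.0 * A)
```

In this toy example the entry `C[0, 2]` is positive even though nodes 0 and 2 share no edge, and `C_rescaled` agrees with `C` because the normalised matrix is unchanged when all weights are multiplied by a common positive constant.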

3. Brain Network 



3.1. Data and acquisition. As noted by Sporns et al. (2005), a major challenge
facing any attempt to model the human brain using complex network theory is 
that the basic structural units of the brain, in terms of network nodes and links, 
are not well defined. Indeed, at least three levels of description are possible: (i) 
individual neurons and synapses (microscale); (ii) neuronal groups and popula- 
tions (mesoscale); and (iii) anatomically distinct brain regions and corresponding 
inter-regional pathways (macroscale). In this work, due to the resolution limits of
MRI data, we focus on the macroscale description of the human brain. We define
a network using the Harvard-Oxford cortical and subcortical structural atlases as
implemented in fslview, part of FSL (Smith et al. 2004), thereby partitioning the
brain into 56 anatomically distinct regions: 48 cortical and 8 subcortical. This
produces a weighted, undirected graph with 56 nodes. In our experiments, we
have data for 9 stroke patients (at least six months following first, left hemisphere,
subcortical stroke) and 10 age-matched controls.

Figure 1. Components corresponding to stroke patients are labeled with crosses
and circles denote controls. Left: components of the second right singular
vector, $v^{[2]}$, of the original data matrix. Centre: components of the scaled right
singular vector $D_{\mathrm{right}}^{-1/2} v^{[2]}$. Right: components of the second right singular
vector, $v^{[2]}$, of the data matrix after post-processing using communicability.

A more detailed description of the materials and methods is provided online; 
see Appendix A. 

3.2. Spectral clustering. We set ourselves the task of unsupervised clustering 
of the patients, to check how accurately we can recover the known stroke/control 
groupings. A patient data set consists of $(56^2 - 56)/2 = 1540$ distinct values,
giving the connectivity strength between each pair of distinct brain regions. We
then used each data set to create a column of a matrix $W \in \mathbb{R}^{1540 \times 19}$, so that
$w_{ij}$ gives the connectivity strength for the $i$th pair of brain regions in patient $j$.
Unsupervised clustering on the 19 columns of this matrix was performed using
the singular value decomposition (SVD) (Higham et al. 2007). This approach is
closely related to many other techniques, such as principal component analysis,
support vector machines/kernel-based methods, machine learning and
multidimensional scaling (Cox & Cox 1994; MacKay 2003; Skillicorn 2007).
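
The construction of the data matrix can be illustrated as follows. This is a sketch under stated assumptions: each subject's connectivity is held as a symmetric 56 x 56 NumPy array, the helper names `pair_vector` and `build_data_matrix` are hypothetical, and random symmetric matrices stand in for the real data.

```python
import numpy as np


def pair_vector(S):
    """Stack the strictly upper-triangular entries of a symmetric matrix S."""
    S = np.asarray(S, dtype=float)
    rows, cols = np.triu_indices(S.shape[0], k=1)
    return S[rows, cols]


def build_data_matrix(matrices):
    """One column per subject, one row per pair of distinct regions."""
    return np.column_stack([pair_vector(S) for S in matrices])


# Placeholder data only: 19 random symmetric matrices with zero diagonal.
rng = np.random.default_rng(0)
subjects = []
for _ in range(19):
    R = rng.random((56, 56))
    S = (R + R.T) / 2.0
    np.fill_diagonal(S, 0.0)
    subjects.append(S)

W = build_data_matrix(subjects)
print(W.shape)  # (1540, 19)
```

The ordering of the region pairs is arbitrary, provided the same ordering is used for every subject.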



The second right singular vector, $v^{[2]}$, can be used to assign a value
$(v^{[2]})_j$ to the $j$th patient, and the aim is that patients with similar connectivity
profiles will be assigned nearby values. The left hand picture in Figure 1 shows
the values of $v^{[2]}$ plotted in increasing order. Components corresponding to
stroke patients are labeled with crosses and circles denote controls. We see from
the picture that the SVD has placed the strokes and controls in order, with the
exception that a stroke and control (in positions 9 and 10) have been misordered.
The middle picture in Figure 1 shows the corresponding plot when the SVD is
applied to the normalised data matrix $D_{\mathrm{left}}^{-1/2} W D_{\mathrm{right}}^{-1/2}$, with
$(D_{\mathrm{left}})_{ii} := \sum_{j=1}^{19} w_{ij}$ and $(D_{\mathrm{right}})_{jj} := \sum_{i=1}^{1540} w_{ij}$, and the scaled right
singular vector $D_{\mathrm{right}}^{-1/2} v^{[2]}$ is displayed, as discussed for the case of
microarray data in (Higham et al. 2007).
We see that the classification is improved by the normalisation process. Closer
inspection of the raw data showed that for the two patients that were originally
ordered incorrectly, one had unusually large and the other had unusually small
overall connectivity weights, $(D_{\mathrm{right}})_{jj}$; this is precisely the situation where
normalisation is designed to be beneficial.
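
A minimal sketch of this scoring step is given below. It is not the code used to produce Figure 1: the function name `second_vector_scores` is hypothetical, the data are synthetic, and all row and column sums are assumed to be strictly positive. Note that the sign of a singular vector is arbitrary, so only the ordering and grouping of the scores is meaningful.

```python
import numpy as np


def second_vector_scores(W, normalise=False):
    """Return one score per column of W from the second right singular vector."""
    W = np.asarray(W, dtype=float)
    if not normalise:
        _, _, Vt = np.linalg.svd(W, full_matrices=False)
        return Vt[1]
    d_left = W.sum(axis=1)   # row sums, diagonal of D_left
    d_right = W.sum(axis=0)  # column sums, diagonal of D_right
    Wn = W / np.sqrt(np.outer(d_left, d_right))  # D_left^{-1/2} W D_right^{-1/2}
    _, _, Vt = np.linalg.svd(Wn, full_matrices=False)
    return Vt[1] / np.sqrt(d_right)  # scaled vector D_right^{-1/2} v


# Synthetic stand-in data: 50 features, 4 "patients" followed by 5 "controls".
rng = np.random.default_rng(1)
base = rng.uniform(1.0, 2.0, size=50)
pattern = np.where(np.arange(50) < 25, 0.5, -0.5)
cols = [base + pattern] * 4 + [base - pattern] * 5
W = np.column_stack(cols) + 0.01 * rng.standard_normal((50, 9))

raw = second_vector_scores(W)
scaled = second_vector_scores(W, normalise=True)
order = np.argsort(scaled)  # subjects sorted by score, as in the plots
```

In this synthetic example the two groups differ by a planted pattern, so both `raw` and `scaled` place the first four columns on one side of the remaining five.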



3.3. Communicability. We motivated the new weighted communicability measure
by arguing that the higher order terms in the power series of equation (2)
contain important additional information. We now provide evidence that weighted
communicability does indeed add value to the raw data. 



3.3.1. Spectral clustering revisited. We start by repeating the unsupervised
clustering task of the previous section for the new data matrix, $C \in \mathbb{R}^{1540 \times 19}$, whose
columns are constructed from the respective communicability networks, so that
$c_{ij}$ gives the communicability strength for the $i$th pair of brain regions in patient
$j$. The right hand plot in Figure 1 shows the values of the second right singular
vector, $v^{[2]}$, plotted in increasing order. We see that post-processing the data
using communicability significantly improves the results of the clustering
algorithm, giving a clearer separation than the unnormalised and normalised versions
based on the raw data. It also gives the aesthetically pleasing result that the two
clusters have opposite signs: negative for strokes and positive for controls. Using
the second left singular vector, $u^{[2]}$, we may proceed to identify those connections
that enable us to distinguish between stroke and control classes; further details
are provided in the supplementary material.



3.3.2. Statistical Validation. To quantify the effect of using weighted
communicability, we applied the mean-centred partial least squares (PLS) approach of
McIntosh and colleagues (McIntosh & Lobaugh 2004). Via the SVD, PLS analysis
returns latent variable pairs (left/right singular vectors containing the
connection/group saliences) which describe a particular pattern of connectivity
covariance according to subject. The statistical significance of each latent variable
was determined using permutation tests with 500 permutations, whilst the
reliability of the saliences of the individual connections in contributing to the pattern of
covariance identified by the latent variables was determined using 100 bootstrap
analyses.

The PLS analysis returned one significant ($p < 0.01$) latent variable pair for
each of the three data sets described above. In each case PLS was able to
distinguish between stroke and control classes; however, this should not be too
surprising since PLS is a supervised method. Perhaps more importantly, the number of
connections which returned saliences in the 99th percentile was greatest for
communicability (318), then the normalised data (290) and lowest in the raw data
(266), suggesting that communicability has the effect of reducing the influence of
noise in the data.
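
The permutation test described in this section can be sketched as follows. This is a deliberately simplified illustration and not the implementation behind the reported results: it assumes one common form of mean-centred PLS in which the matrix of group means, centred by subtracting the mean of the group means, is passed to the SVD; it tests only the first latent variable; and the function names, the synthetic data and the +1 correction in the p-value are all assumptions. The bootstrap assessment of the saliences is not shown.

```python
import numpy as np


def first_singular_value(X, labels):
    """Largest singular value of the centred matrix of group means.

    X has one row per subject and one column per connection.
    """
    groups = np.unique(labels)
    means = np.vstack([X[labels == g].mean(axis=0) for g in groups])
    centred = means - means.mean(axis=0)
    return np.linalg.svd(centred, compute_uv=False)[0]


def permutation_p_value(X, labels, n_perm=500, seed=0):
    """Estimate a p-value by shuffling the group labels n_perm times."""
    rng = np.random.default_rng(seed)
    observed = first_singular_value(X, labels)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)
        if first_singular_value(X, shuffled) >= observed:
            exceed += 1
    # Assumed convention: count the observed labelling as one permutation.
    return (exceed + 1) / (n_perm + 1)


# Synthetic stand-in data: 9 "patients" and 10 "controls", 40 connections,
# with a group difference planted in the first 10 connections.
rng = np.random.default_rng(42)
labels = np.array([0] * 9 + [1] * 10)
X = 0.5 * rng.standard_normal((19, 40))
X[labels == 0, :10] += 1.0

p = permutation_p_value(X, labels)
```

With 500 permutations and this convention the smallest attainable p-value is 1/501, so a threshold such as $p < 0.01$ is resolvable.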

4. Conclusion 

Our new network measure extends the concept of communicability in a natural 
manner to the case of weighted networks. Initial tests on cutting-edge anatomical 
brain connectivity data show that this measure can give statistically significant 
enhancement to the performance of standard data analysis tools.

Appendix A. Supplementary data

Supplementary data associated with this article can be found at 
http://www.maths.strath.ac.uk/~gcb07174/crofts/rs/rsoc08_supp.html